Abstract

Given a metric space on $n$ points, an $\alpha$-approximate universal algorithm for the Steiner tree
problem outputs a distribution over rooted spanning trees such that for any subset $X$ of vertices
containing the root, the expected cost of the induced subtree is within an $\alpha$ factor of the optimal
Steiner tree cost for $X$. An $\alpha$-approximate differentially private algorithm for the Steiner tree
problem takes as input a subset $X$ of vertices, and outputs a tree distribution that induces a
solution within an $\alpha$ factor of the optimal as before, and satisfies the additional property that for
any set $X'$ that differs in a single vertex from $X$, the tree distributions for $X$ and $X'$ are "close"
to each other. Universal and differentially private algorithms for TSP are defined similarly. An
$\alpha$-approximate universal algorithm for the Steiner tree problem or TSP is also an $\alpha$-approximate
differentially private algorithm. It is known that both problems admit $O(\log n)$-approximate
universal algorithms, and hence $O(\log n)$-approximate differentially private algorithms as well.

We prove an $\Omega(\log n)$ lower bound on the approximation ratio achievable for the universal
Steiner tree problem and the universal TSP, matching the known upper bounds. Our lower bound
for the Steiner tree problem holds even when the algorithm is allowed to output a more general
solution of a distribution on paths to the root. This improves upon an earlier $\Omega(\log n/\log\log n)$
lower bound for the universal Steiner tree problem, and an $\Omega(\log^{1/6} n)$ lower bound for the
universal TSP. The latter answers an open question in Hajiaghayi et al. [13]. When expressed as
a function of the size of the input subset of vertices, say $k$, our lower bounds are in fact $\Omega(k)$ for
both problems, improving upon the previously known $\log^{\Omega(1)} k$ lower bounds. We then show that
whenever the universal problem has a lower bound that satisfies an additional property, it implies
a similar lower bound for the differentially private version. Using this converse relation between
universal and private algorithms, we establish an $\Omega(\log n)$ lower bound for the differentially
private Steiner tree and the differentially private TSP. This answers a question of Talwar [28]. Our
results highlight a natural connection between universal and private approximation algorithms
that is likely to have other applications.
1 Introduction



Traditionally, in algorithm design one assumes that the algorithm has complete access to the input 
data which it can use unrestrictedly to output the optimal, or near optimal, solution. In many 
applications, however, this assumption does not hold and the traditional approach towards algorithms
needs to be revised. For instance, let us take the problem of designing the cheapest multicast
network connecting a hub node to a set of client nodes; this is a standard network design problem 
which has been studied extensively. Consider the following two situations. In the first setting, the 
actual set of clients is unknown to the algorithm, and yet the output multicast network must be
"good for all" possible client sets. In the second setting, the algorithm knows the client set; however,
it must ensure that the output preserves the privacy of the clients. Clearly, in both
these settings, the traditional algorithms for network design do not suffice.

The situations described above are instances of two general classes of problems recently studied 
in the literature. The first situation needs the design of universal or a-priori algorithms; algorithms 
which output solutions when parts of the input are uncertain or unknown. The second situation needs 
the design of differentially private algorithms; algorithms where parts of the input are controlled by 
clients whose privacy concerns constrain the behaviour of the algorithm. A natural question arises: 
how do the constraints imposed by these classes of algorithms affect their performance? 

In this paper, we study universal and differentially private algorithms for two fundamental combinatorial
optimization problems: the Steiner tree problem and the travelling salesman problem (TSP).
The network design problem mentioned above corresponds to the Steiner tree problem. We resolve 
the performance question of universal and private algorithms for these two problems completely by 
giving lower bounds which match the known upper bounds. In particular, our work resolves the 
open questions of Hajiaghayi et al. [13] and Talwar [28]. Our techniques and constructions are quite 
basic, and we hope these could be applicable to other universal and private algorithms for sequencing 
and network design problems. 

Problem formulations. In both the Steiner tree problem and the TSP, we are given a metric
space $(V, c)$ on $n$ vertices with a specified root vertex $r \in V$. Given a subset of terminals, $X \subseteq V$,
we denote the cost of the optimal Steiner tree connecting $X \cup r$ by $\mathrm{opt}_{ST}(X)$. Similarly, we denote
the cost of the optimal tour connecting $X \cup r$ by $\mathrm{opt}_{TSP}(X)$. If $X$ is known, then both $\mathrm{opt}_{ST}(X)$
and $\mathrm{opt}_{TSP}(X)$ can be approximated up to constant factors.

A universal algorithm for the Steiner tree problem, respectively the TSP, does not know the
set of terminals $X$, but must output a distribution $\mathcal{D}$ on rooted trees $T$, respectively tours $\sigma$,
spanning all vertices of $V$. Given a terminal set $X$, let $T[X]$ be the minimum-cost rooted subtree
of $T$ which contains $X$. Then the cost of the universal Steiner tree algorithm on terminal set $X$
is $\mathbf{E}_{T \leftarrow \mathcal{D}}[c(T[X])]$. We say the universal Steiner tree algorithm is $\alpha$-approximate, if for all metric
spaces and all terminal sets $X$, this cost is at most $\alpha \cdot \mathrm{opt}_{ST}(X)$. Similarly, given a terminal set $X$,
let $\sigma_X$ denote the order in which vertices of $X$ are visited in $\sigma$, and let $c(\sigma_X)$ denote the cost of this
tour. That is, $c(\sigma_X) := c(r, \sigma_X(1)) + \sum_{i=1}^{|X|-1} c(\sigma_X(i), \sigma_X(i+1)) + c(\sigma_X(|X|), r)$. The cost of the
universal TSP algorithm on set $X$ is $\mathbf{E}_{\sigma \leftarrow \mathcal{D}}[c(\sigma_X)]$, and the approximation factor is defined as it is
for the universal Steiner tree algorithm.
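
As an illustration of the two cost functions just defined, the following minimal Python sketch evaluates $c(T[X])$ and $c(\sigma_X)$ for a single fixed tree and a single fixed tour. The representation (a `parent` map for the rooted tree, a dictionary `c` for the metric) and the function names are assumptions made for this example only. The sketch also assumes that the minimum-cost rooted subtree containing $X$ is the union of the tree paths from the terminals to the root.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Vertex = Hashable
Metric = Dict[Tuple[Vertex, Vertex], float]


def dist(c: Metric, u: Vertex, v: Vertex) -> float:
    """Symmetric lookup into the metric c, with c(u, u) = 0."""
    if u == v:
        return 0.0
    return c[(u, v)] if (u, v) in c else c[(v, u)]


def induced_subtree_cost(parent: Dict[Vertex, Vertex], c: Metric,
                         root: Vertex, X: Iterable[Vertex]) -> float:
    """c(T[X]): cost of the union of the tree paths from terminals in X to the root."""
    used = set()
    for v in X:
        # Climb towards the root; stop early once we reach an already counted edge.
        while v != root and v not in used:
            used.add(v)          # v stands for the tree edge (v, parent[v])
            v = parent[v]
    return sum(dist(c, v, parent[v]) for v in used)


def induced_tour_cost(sigma: Sequence[Vertex], c: Metric,
                      root: Vertex, X: Iterable[Vertex]) -> float:
    """c(sigma_X): visit the terminals in the order they appear in sigma, from and back to the root."""
    terminals = set(X) - {root}
    order = [v for v in sigma if v in terminals]
    stops = [root] + order + [root]
    return sum(dist(c, a, b) for a, b in zip(stops, stops[1:]))
```

The cost of a universal algorithm on $X$ would then be the average of these quantities over trees or tours drawn from its output distribution.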

A differentially private algorithm for Steiner trees and TSPs, on the other hand, knows the set
of terminals $X$; however, there is a restriction on the solution that it can output. Specifically, a
differentially private algorithm for the Steiner tree problem with privacy parameter $\epsilon$ returns, on
any input terminal set $X$, a distribution $\mathcal{D}_X$ on trees spanning $V$, with the following property. Fix
any set of trees $\mathcal{T}$, and let $X'$ be any terminal set such that the symmetric difference of $X'$ and $X$
is exactly one vertex. Then,

$$\Pr_{T \leftarrow \mathcal{D}_{X'}}[T \in \mathcal{T}] \cdot \exp(-\epsilon) \;\le\; \Pr_{T \leftarrow \mathcal{D}_{X}}[T \in \mathcal{T}] \;\le\; \Pr_{T \leftarrow \mathcal{D}_{X'}}[T \in \mathcal{T}] \cdot \exp(\epsilon).$$

The cost of the algorithm on set $X$ is $\mathbf{E}_{T \leftarrow \mathcal{D}_X}[c(T[X])]$ as before, and the approximation factor is
defined as that for universal trees. Differentially private algorithms for the TSP are defined likewise.
To gain some intuition as to why this definition preserves privacy, suppose each vertex is a user and
controls a bit which reveals its identity as a terminal or not. The above definition ensures that even
if a user changes its identity, the algorithm's behaviour does not change by much, and hence the
algorithm does not leak any information about the user's identity. This notion of privacy is arguably
the standard and strongest notion of privacy in the literature today; we point the reader to [4] for
an excellent survey on the same. We make two simple observations: (a) any universal algorithm is
a differentially private algorithm with $\epsilon = 0$, (b) if the size of the symmetric difference in the above
definition is $k$ instead of 1, then one can apply the definition iteratively to get $k\epsilon$ in the exponent.
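
The closeness condition can be made concrete for distributions with finite support. The sketch below is only an illustration under assumed names (`is_eps_close`, and dictionaries mapping outcomes to probabilities). It checks the condition outcome by outcome, which for discrete distributions implies the same bound for every set of outcomes, since probabilities of sets are sums of outcome probabilities.

```python
import math
from typing import Dict, Hashable

Dist = Dict[Hashable, float]


def is_eps_close(p: Dist, q: Dist, eps: float, tol: float = 1e-12) -> bool:
    """True if p(o) <= exp(eps) * q(o) and q(o) <= exp(eps) * p(o) for every outcome o."""
    bound = math.exp(eps)
    for o in set(p) | set(q):
        po, qo = p.get(o, 0.0), q.get(o, 0.0)
        # tol absorbs floating point rounding in the comparison.
        if po > bound * qo + tol or qo > bound * po + tol:
            return False
    return True
```

In these terms, observation (a) says that a universal algorithm returns the same distribution for every terminal set, so `is_eps_close(p, p, 0.0)` holds trivially. Observation (b) corresponds to chaining $k$ such checks, one per changed vertex, which multiplies the bound to $\exp(k\epsilon)$.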

For the Steiner tree problem, one can consider another natural and more general solution space
for universal and private algorithms, where instead of returning a distribution on trees spanning $V$,
the algorithm returns a distribution $\mathcal{D}$ on collections of paths $P := \{p_v : v \in V\}$, where each $p_v$
is a path from $v$ to the root $r$. Given a single collection $P$, and a terminal set $X$, the cost of the
solution is $c(P[X]) := c(\bigcup_{v \in X} E(p_v))$, where $E(p_v)$ is the set of edges in the path $p_v$. The cost of
the algorithm on set $X$ is $\mathbf{E}_{P \leftarrow \mathcal{D}}[c(P[X])]$. Since any spanning tree induces an equivalent collection
of paths, this solution space is more expressive, and as such, algorithms in this class may achieve 
stronger performance guarantees. Somewhat surprisingly, we show that this more general class of 
algorithms is no more powerful than algorithms that are restricted to output a spanning tree. 
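
To make the path-collection cost $c(P[X])$ concrete, here is a small Python sketch. The data layout (each path stored as a vertex list ending at the root, edge costs keyed by unordered vertex pairs) and the name `path_collection_cost` are assumptions for illustration. Note that edges shared by several paths are paid for once, and that the collection need not form a tree.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List

Vertex = Hashable
Edge = FrozenSet[Vertex]


def path_collection_cost(paths: Dict[Vertex, List[Vertex]],
                         cost: Dict[Edge, float],
                         X: Iterable[Vertex]) -> float:
    """c(P[X]): cost of the union of the edge sets E(p_v) over terminals v in X."""
    edges = set()
    for v in X:
        p = paths[v]                                   # p = [v, ..., r]
        edges.update(frozenset(e) for e in zip(p, p[1:]))
    return sum(cost[e] for e in edges)


# Hypothetical example: the path of x leaves m through b, while m's own path goes
# through a, so this collection is not induced by any single spanning tree.
cost = {frozenset(e): 1.0 for e in
        [("a", "r"), ("b", "r"), ("m", "a"), ("m", "b"), ("x", "m")]}
paths = {"a": ["a", "r"], "b": ["b", "r"],
         "m": ["m", "a", "r"], "x": ["x", "m", "b", "r"]}
print(path_collection_cost(paths, cost, {"x", "m"}))
```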

1.1 Previous Work and Our Results. 

A systematic study of universal algorithms was initiated by Jia et al. [16], who gave $O(\log^4 n/\log\log n)$-approximate
universal algorithms for both the Steiner tree problem and the TSP. Their algorithm is
in fact deterministic and returns a single tree. Gupta et al. improved the TSP result by giving
a single tour which is $O(\log^2 n)$-approximate. As noted by [16], results of [8] on probabilistically
embedding general metrics into tree metrics imply randomized $O(\log n)$-approximate universal
algorithms for these problems (see Appendix A for details).

Jia et al. [16] observe that a lower bound for online Steiner tree algorithms implies a lower bound
for universal Steiner tree algorithms; thus, following the result of Imase and Waxman [15], one
obtains a lower bound of $\Omega(\log n)$ for any universal Steiner tree algorithm. It is not hard to see that
the [15] lower bound also holds for algorithms returning a collection of vertex-to-root paths. Jia et
al. [16] explicitly leave lower bounds for the universal TSP as an open problem. Hajiaghayi et al.
[13] make progress on this by showing an $\Omega\left(\sqrt[6]{\log n/\log\log n}\right)$ lower bound for universal TSP; this
holds even in the two dimensional Euclidean metric space. [13] conjectured that for general metrics
the lower bound should be $\Omega(\log n)$; in fact, they conjectured this for the shortest path metric of a
constant degree expander.

When the metric space has certain special properties (for instance if it is the Euclidean metric
in constant dimensional space), Jia et al. [16] give improved universal algorithms for both
Steiner tree and TSP, which achieve an approximation factor of $O(\log n)$ for both problems. Furthermore,
if the size of the terminal set $X$ is $k$, their approximation factor improves to $O(\log k)$,
a significant improvement when $k \ll n$. This leads to the question whether universal algorithms
exist for these problems whose approximation factors are a non-trivial function of $k$ alone. A
$k$-approximate universal Steiner tree algorithm is trivial; the shortest path tree achieves this factor.
This in turn implies a $2k$-approximate universal TSP algorithm. Does either of these problems admit
an $o(k)$-approximate algorithm? The constructions of [15] achieving a lower bound of $\Omega(\log n)$ for
universal Steiner tree require terminal sets that are of size $n^{\Omega(1)}$, and do not rule out the possibility
of an $O(\log k)$-approximation in general. In fact, for many network optimization problems, an initial
polylog($n$) approximation bound was subsequently improved to a polylog($k$) approximation (e.g.,
sparsest cut [19, 20], asymmetric $k$-center [25, 1], and more recently, the works of Moitra et al.
[23, 24] on vertex sparsifiers imply such a result for many other cut and flow problems). It is thus
conceivable that a polylog($k$)-approximation could be possible for the universal algorithms as well.

We prove $\Omega(\log n)$ lower bounds for the universal TSP and the Steiner tree problem, even when
the algorithm returns vertex-to-root paths for the latter (Theorems 1 and 2). Furthermore, the size
of the terminal sets in our lower bounds is $O(\log n)$, ruling out any $o(k)$-approximate universal algorithm for
either of these problems. (Very recently, we were made aware of independent work by Gorodezky et
al. [10] who obtained similar lower bounds for the universal TSP. We make a comparison
of the results of our work and theirs at the end of this subsection.)

Private vs universal algorithms. The study of differentially private algorithms for combinatorial
optimization problems is much newer, and the paper by Gupta et al. [12] gives a host of private
algorithms for many optimization problems. Since any universal algorithm is a differentially private
algorithm with $\epsilon = 0$, the above stated upper bounds for universal algorithms hold for differentially
private algorithms as well. For the Steiner tree problem and TSP, though, no better differentially
private algorithms are known. Talwar, one of the authors of [12], recently posed an open question
whether a private $O(1)$-approximation exists for the Steiner tree problem, even if the algorithm is
allowed to use a more general solution space, namely, return a collection of vertex-to-root paths,
rather than Steiner trees [28].

We observe that a simple but useful converse relation holds between universal and private algorithms:
"strong" lower bounds for universal algorithms imply lower bounds for differentially private
algorithms. More precisely, suppose we can show that for any universal algorithm for the Steiner
tree problem/TSP, there exists a terminal set $X$, such that the probability that a tree/tour drawn
from the distribution has cost less than $\alpha$ times the optimal cost is at most $\exp(-\epsilon|X|)$ for a certain constant
$\epsilon$. Then we get an $\Omega(\alpha)$ lower bound on the performance of any $\epsilon$-differentially private algorithm
for these problems (Corollary 1). Note that this is a much stronger statement than merely proving
a lower bound on the expected cost of a universal algorithm. The expected cost of a universal algorithm
may be $\Omega(\alpha)$, for instance, even if it achieves optimal cost with probability 1/2, and $\alpha$ times
the optimal cost with probability 1/2. In fact, none of the previous works mentioned above [15, 16, 13]
imply strong lower bounds. The connection between strong lower bounds on universal algorithms
and lower bounds for differentially private algorithms holds for a general class of problems, and may
serve as a useful tool for establishing lower bounds for differentially private algorithms (Section 3).

In contrast to previous work, all the lower bounds we prove for universal Steiner trees and TSP
are strong in the sense defined above. Thus, as corollaries, we get lower bounds of $\Omega(\log n)$ on the
performance of differentially private algorithms for Steiner tree and TSP. Since the lower bound for
Steiner trees holds even when the algorithm returns a collection of paths, this answers the question
of Talwar [28] negatively (Corollaries 1 and 2).

The metric spaces for our lower bounds on universal Steiner tree and TSP are shortest path
metrics on constant degree Ramanujan expanders. To prove the strong lower bounds on distributions
of trees/tours, it suffices, by Yao's lemma, to construct a distribution on terminal sets such that
any fixed tree/tour pays, with high probability, $\Omega(\log n)$ times the optimum tree/tour's cost on a
terminal set picked from the distribution. We show that a random walk, or a union of two random
walks, suffices for the Steiner tree and the TSP case, respectively.

Comparison of our results with [10]. Gorodezky et al. [10] independently obtained an $\Omega(\log n)$ lower
bound for universal TSP. Like us, the authors construct the lower bound using random walks on
constant degree expanders. Although the result is stated for deterministic algorithms, Theorem 2 in
their paper implies that the probability any randomized algorithm pays $o(\log n)$ times the optimum
for a certain subset is at most a constant. Furthermore, their result also implies an $\Omega(k)$ lower bound
on the performance of a universal TSP algorithm where $k$ is the number of terminals.

Although [10] do not address the universal Steiner tree problem directly, the $\Omega(k)$ lower bound for
universal TSP implies an $\Omega(k)$ lower bound for universal Steiner tree as well, but only when the algorithm
returns spanning trees. This implication does not extend to algorithms which return collections of
vertex-to-root paths. Our result provides the first lower bound for the universal Steiner tree problem
when the algorithm is allowed to return a collection of vertex-to-root paths.

Furthermore, even though our proof idea is similar, our results are stronger since we show
"strong" lower bounds for the universal problems: we prove that the probability any randomized
algorithm pays $o(\log n)$ times the optimum for a certain subset is exponentially small in the size of
the client set. (We state the precise technical difference in Section 2.2 while describing our lower
bound.) As stated above, strong lower bounds are necessary in our technique for proving privacy
lower bounds. In particular, no lower bounds for differentially private Steiner tree (even for weaker
algorithms returning spanning trees instead of vertex-to-root paths) and TSP can be deduced from
their results.

1.2 Related Work 

Although universal algorithms in their generality were first studied by Jia et al. [16], the universal
TSP on the plane was investigated by Platzman and Bartholdi [26], who showed that a certain
space filling curve is an $O(\log n)$-approximate algorithm for points on the two dimensional plane.
Bertsimas and Grigni [3] conjecture that this factor is tight, and [13] makes progress in this direction,
although until this work, it was not known even for points in a general metric space. It is an interesting
open question to see if our ideas could be modified for the special metric as well.

The notion of differential privacy was developed in the regime of statistical data analysis to
reveal statistics of a database without leaking any extra information of individual entries; the currently
adopted definition is due to Dwork et al. [6], and since its definition a large body of work has arisen
trying to understand the strengths and limitations of this concept. We point the reader to excellent
surveys by Dwork and others [4, 7] for a detailed treatment. Although the notion of privacy arose
in the realm of databases, the concept is more universally applicable to algorithms where parts of the
inputs are controlled by privacy-concerned users. Aside from the work of Gupta et al. [12] on various
combinatorial optimization problems, algorithms with privacy constraints have been developed for
other problems such as computational learning problems [17], geometric clustering problems [9],
recommendation systems [22], to name a few.

Organization. In Section 2, we establish an $\Omega(\log n)$ lower bound for the universal Steiner tree
problem and the universal TSP. As mentioned above, the lower bound for the Steiner tree problem is
for a more general class of algorithms which return a collection of paths instead of a single tree. The
lower bounds established are strong in the sense defined earlier, and thus give an $\Omega(\log n)$ lower bound
for private Steiner tree as well as private TSP. We formalize the connection between strong lower
bounds for universal problems and approximability of differentially private variants in Section 3.
Finally, for the sake of completeness, we provide in Appendix A a brief description of some upper bound
results that follow implicitly from earlier works.

2 Lower Bound Constructions 

The metric spaces on which we obtain our lower bounds are shortest path metrics of expander
graphs. Before exhibiting our constructions, we state a few known results regarding expanders that
we use. An $(n, d, \beta)$ expander is a $d$-regular, $n$-vertex graph with the second largest eigenvalue of
its adjacency matrix $\beta < 1$. The girth $g$ is the size of the smallest cycle and the diameter $\Delta$ is
the maximum distance between two vertices. A $t$-step random walk on an expander picks a vertex
uniformly at random, and at each step moves to a neighboring vertex uniformly at random.
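
The random walk used throughout this section can be sketched in a few lines of Python. The adjacency-list representation and the name `random_walk` are assumptions of this example, and the convention that a $t$-step walk is a sequence of $t$ vertices follows the later notation $q = (u_1, \ldots, u_t)$. The toy graph in the test below is the complete graph on four vertices, chosen only because it is regular; it is not a Ramanujan graph.

```python
import random
from typing import Dict, Hashable, List, Optional

Vertex = Hashable


def random_walk(adj: Dict[Vertex, List[Vertex]], t: int,
                rng: Optional[random.Random] = None) -> List[Vertex]:
    """Return a walk (u_1, ..., u_t): uniform start vertex, then uniform neighbour moves."""
    rng = rng or random.Random()
    v = rng.choice(list(adj))        # start vertex chosen uniformly at random
    walk = [v]
    for _ in range(t - 1):
        v = rng.choice(adj[v])       # move to a uniformly random neighbour
        walk.append(v)
    return walk
```

The terminal set generated by a walk is then simply `set(walk)`, the set of distinct vertices it visits.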

Lemma 1. [21] For any constant $k$, there exist $(n, d, \beta)$ expanders, called Ramanujan graphs, with
$d \ge k$, $\beta \le 2/\sqrt{d}$, girth $g = \Theta(\log n/\log d)$, and diameter $\Delta = O(\log n/\log d)$.

Lemma 2. (Theorem 3.6, [14]) Given an $(n, d, \beta)$ expander, and a subset of vertices $B$ with $|B| = \alpha n$,
the probability that a $t$-step random walk remains completely inside $B$ is at most $(\alpha + \beta)^t$.

Lemma 3. (Follows from Theorem 3.10, [14]) Given an $(n, d, \beta)$ expander, a subset of vertices $B$
with $|B| = \alpha n$, and any $\gamma$, $0 < \gamma < 1$, the probability that a $t$-step random walk visits more than $\gamma t$
vertices in $B$ is at most $2^t \cdot (\alpha + \beta)^{\gamma t}$.

2.1 Steiner Tree Problem 

We consider a stronger class of algorithms that are allowed to return a distribution $\mathcal{D}$ on collections of
paths $P := \{p_v : v \in V\}$, where each $p_v$ is a path from $v$ to the root $r$. As stated in the introduction,
this class of algorithms captures as a special case algorithms that simply return a distribution on
spanning trees, since a spanning tree induces a collection of paths. We prove the following
theorem.

Theorem 1. For any constant $\epsilon > 0$ and for large enough $n$, there exists a metric space $(V, c)$ on
$n$ vertices such that for any distribution $\mathcal{D}$ on collections of paths, there is a terminal set $X$ of size
$O(\log n)$, such that

$$\Pr_{P \leftarrow \mathcal{D}}\left[c(P[X]) = o\left(\frac{\log n}{1+\epsilon}\right) \cdot \mathrm{opt}_{ST}(X)\right] \le \frac{1}{2}\exp(-\epsilon|X|). \qquad (1)$$

At a high level, the idea underlying our proof is as follows. We choose as our underlying graph a
Ramanujan graph $G$, and consider the shortest path metric induced by this graph. We show that
for any fixed collection $P$ of vertex-to-root paths, a terminal set generated by a random walk $q$ of
length $O(\log n)$ in $G$ has the following property with high probability: the edges on $q$ frequently
"deviate" from the paths in the collection $P$. These deviations can be mapped to cycles in $G$, and
the high-girth property is then used to establish that the cost of the solution induced by $P$ is $\Omega(\log n)$
times the optimal cost. Before proving Theorem 1, we establish the following corollaries of it.
Corollary 1. (a) There is no $o(\log n)$-approximate universal Steiner tree algorithm. (b) There is
no $o(k)$-approximate universal Steiner tree algorithm where $k$ is the size of the terminal set. (c) For
any $\epsilon > 0$, there is no $o(\log n/(1+\epsilon))$-approximate private algorithm with privacy parameter $\epsilon$.

Proof. The proofs of (a) and (b) are immediate by fixing $\epsilon$ to be any constant. The universal
algorithm pays at least $\Omega(\log n)$ times the optimum with high probability, thus giving a lower bound
of $\Omega(\log n)$ on the expected cost. To see (c), consider a differentially private algorithm $\mathcal{A}$ with privacy
parameter $\epsilon$. Let $\mathcal{D}$ be the distribution on the collection of paths returned by $\mathcal{A}$ when the terminal
set is $\emptyset$. Let $X$ be the subset of vertices corresponding to this distribution in Theorem 1. Let
$\mathcal{P} := \{P : c(P[X]) = o(\frac{\log n}{1+\epsilon}) \cdot \mathrm{opt}_{ST}(X)\}$; we know $\Pr_{P \leftarrow \mathcal{D}}[P \in \mathcal{P}] \le \frac{1}{2}\exp(-\epsilon|X|)$. Let $\mathcal{D}'$ be the
distribution on the collection of paths returned by $\mathcal{A}$ when the terminal set is $X$. By the definition
of $\epsilon$-differential privacy, we know that $\Pr_{P \leftarrow \mathcal{D}'}[P \in \mathcal{P}] \le \exp(\epsilon \cdot |X|) \cdot \left(\frac{1}{2}\exp(-\epsilon|X|)\right) \le 1/2$. Thus
with probability at least 1/2, the differentially private algorithm returns a collection of paths of cost
at least $\Omega\left(\frac{\log n}{1+\epsilon}\right) \cdot \mathrm{opt}_{ST}(X)$, implying the lower bound. □
Note that the statement of Theorem 1 is much stronger than what is needed to prove the universal
lower bounds. The proof of part (c) of the above corollary illustrates our observation that showing
strong lower bounds for universal problems implies lower bounds for privacy problems. This holds
more generally, and we explore this more in Section 3. We now prove Theorem 1.

Proof of Theorem 1. Consider an $(n, d, \beta)$ expander as in Lemma 1 with degree $d \ge 2^{K(1+\epsilon)}$,
where $K$ is a large enough constant. The metric $(V, c)$ is the shortest path metric induced by this
expander. The root vertex $r$ is an arbitrary vertex in $V$.

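We now demonstrate a distribution $\mathcal{D}'$ on terminal sets $X$ such that $\epsilon|X| \le C_0 \log n$, for some
constant $C_0$, and for any fixed collection of paths $P$,

$$\Pr_{X \leftarrow \mathcal{D}'}\left[c(P[X]) = o\left(\frac{\log n}{1+\epsilon}\right) \mathrm{opt}_{ST}(X)\right] \le \frac{1}{2}\exp(-C_0 \log n). \qquad (2)$$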



The lemma below is essentially similar to Yao's lemma [29] used for establishing lower bounds
on the performance of randomized algorithms against oblivious adversaries.

Lemma 4. Existence of a distribution $\mathcal{D}'$ satisfying (2) proves Theorem 1.

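Proof. For brevity, denote the expression in the RHS of (2) by $p$. Let $\pi_X$ be the probability of $X$ in the
distribution $\mathcal{D}'$ and $\pi_P$ be the probability of collection $P$ in the distribution $\mathcal{D}$. Let $\mathcal{E}(P, X)$ denote
the event $c(P[X]) = o(\log n/(1+\epsilon))\,\mathrm{opt}_{ST}(X)$. Then (2) implies that for each $P$ in the support
of $\mathcal{D}$, we have $\sum_{X \in \mathrm{supp}(\mathcal{D}') : \mathcal{E}(P,X)} \pi_X \le p$. Thus, $\sum_{P \in \mathrm{supp}(\mathcal{D})} \pi_P \left(\sum_{X \in \mathrm{supp}(\mathcal{D}') : \mathcal{E}(P,X)} \pi_X\right) \le p$. Interchanging
summations, $\sum_{X \in \mathrm{supp}(\mathcal{D}')} \pi_X \left(\sum_{P \in \mathrm{supp}(\mathcal{D}) : \mathcal{E}(P,X)} \pi_P\right) \le p$, which implies that there
exists $X \in \mathrm{supp}(\mathcal{D}')$ such that $\Pr_{P \leftarrow \mathcal{D}}[c(P[X]) = o(\log n/(1+\epsilon))\,\mathrm{opt}_{ST}(X)] \le \frac{1}{2}\exp(-C_0 \log n) \le
\frac{1}{2}\exp(-\epsilon|X|)$. □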
The distribution $\mathcal{D}'$ is defined as follows. Recall that the girth and the diameter of $G$ are denoted
by $g$ and $\Delta$ respectively, and both are $\Theta\left(\frac{\log n}{\log d}\right)$. Consider a random walk $q$ of $t$ steps in $G$, where
$t = g/3$, and let $X$ be the set of distinct vertices in the random walk. This defines the distribution
on terminal sets. Note that each $X$ in the distribution has size $|X| = O(\log n/\log d)$. We define $C_0$
later to be a constant independent of $d$, and thus since $d$ is large enough, $\epsilon|X| \le C_0 \log n$.

Fix a collection of paths $P$. Since we use the shortest path metric of $G$, we may assume that
$P$ is a collection of paths in $G$ as well. Let $(v, v_1)$ be the first edge on the path $p_v$, and let $F :=
\{(v, v_1) : v \in V\}$ be the collection of all these first edges. The following is the crucial observation
which gives us the lower bound. Call a walk $q = (u_1, \ldots, u_t)$ on $t$ vertices good if at most $t/8$ of the
edges of the form $(u_i, u_{i+1})$ are in $F$, and it contains at least $t/2$ distinct vertices.
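
The definition of a good walk translates directly into a check. The sketch below is illustrative only: the helper names `first_edges` and `is_good_walk` are hypothetical, paths are assumed to be vertex lists ending at the root, and edges are treated as unordered pairs.

```python
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set

Vertex = Hashable
Edge = FrozenSet[Vertex]


def first_edges(paths: Dict[Vertex, List[Vertex]]) -> Set[Edge]:
    """F: the first edge (v, v_1) of every path p_v that has at least one edge."""
    return {frozenset(p[:2]) for p in paths.values() if len(p) >= 2}


def is_good_walk(walk: Sequence[Vertex], F: Set[Edge]) -> bool:
    """Good: at most t/8 walk edges lie in F and the walk has at least t/2 distinct vertices."""
    t = len(walk)
    in_F = sum(1 for a, b in zip(walk, walk[1:]) if frozenset((a, b)) in F)
    return in_F <= t / 8 and len(set(walk)) >= t / 2
```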

Lemma 5. Let $q$ be a good walk of length $t = g/3$ and let $X$ be the set of distinct vertices in $q$.
Then $c(P[X]) = \Omega(|X| g)$.

Proof. Let $X'$ be the vertices in $X$ which do not traverse edges in $F$ in the random walk $q$. Thus
$|X'| \ge |X| - 2t/8 \ge |X|/2$. We now claim that $c(P[X']) \ge |X'| g/3$, which proves the lemma. For
every $u \in X'$, let $p'_u$ be the first $g/3$ edges in the path $p_u$ (if $p_u$'s length is smaller than $g/3$, $p'_u = p_u$).
All the $p'_u$'s are vertex disjoint: if $p'_u$ and $p'_v$ intersect then the union of the edges in $p'_u$, $p'_v$ and the
part of the walk $q$ from $v$ to $u$ contains a cycle of length at most $g$, contradicting that the girth of $G$
is $g$. Thus $c(P[X']) \ge c(\bigcup_{u \in X'} p'_u) \ge |X'| g/3 \ge |X| g/6$. □
Call the set of edges $F$ bad; note that the number of bad edges is at most $n$. Lemma 6, which
we state and prove below, implies that the probability a $t$-step random walk is good is at least
$(1 - d^{-\Omega(t)})$. Observe that this expression is $(1 - \exp(-C_0 \log n))$ for a constant $C_0$ independent of
$d$. Furthermore, whenever $q$ is a good walk, the set of distinct vertices $X$ in $q$ are at least $t/2$ in
number; therefore $\mathrm{opt}_{ST}(X) \le t + \Delta = O(|X|)$ since one can always connect $X$ to $r$ by travelling
along $q$ and then connecting to $r$. On the other hand, Lemma 5 implies that $c(P[X]) = \Omega(|X| g) =
\Omega\left(\frac{\log n}{\log d}\right) \cdot \mathrm{opt}_{ST}(X) = \Omega\left(\frac{\log n}{1+\epsilon}\right) \cdot \mathrm{opt}_{ST}(X)$, by our choice of $d$. This gives that

$$\Pr_{X \leftarrow \mathcal{D}'}\left[c(P[X]) = o\left(\frac{\log n}{1+\epsilon}\right) \mathrm{opt}_{ST}(X)\right] \le \frac{1}{2}\exp(-C_0 \log n)$$

where $C_0$ is independent of $d$. Thus, $\mathcal{D}'$ satisfies (2), implying, by Lemma 4, Theorem 1. □

Lemma 6. Let $G$ be an $(n, d, \beta)$ expander where $d$ is a large constant ($\ge 2^{100}$, say) and $\beta \le 2/\sqrt{d}$.
Suppose we mark an arbitrarily chosen subset of $n$ edges in $G$ as bad. Then the probability that a $t$-step
random walk contains at most $t/8$ bad edges and covers at least $t/2$ distinct vertices is at least
$(1 - d^{-\Omega(t)})$.

Proof. Let $\mathcal{E}_1$ be the event that a $t$-step random walk contains fewer than $t/2$ distinct vertices, and
let $\mathcal{E}_2$ be the event that a $t$-step random walk contains at least $t/8$ bad edges. We bound these
probabilities separately.



Claim 1. $\Pr[\mathcal{E}_1] = d^{-\Omega(t)}$.



Proof. Partition $V$ arbitrarily into $\ell = t\sqrt{d}/2$ sets of $2n/(t\sqrt{d})$ vertices each. $\Pr[\mathcal{E}_1]$ can be bounded by
the probability that a $t$-step random walk visits fewer than $t/2$ of these sets. Since any fixed collection of
$t/2$ sets contains at most $\alpha n := n/\sqrt{d}$ vertices, by Lemma 2, the probability that a $t$-step random
walk remains inside the union of these sets is at most $(3/\sqrt{d})^t$. By a union bound over all possible
choices of $t/2$ sets, we get

$$\Pr[\mathcal{E}_1] \le \binom{\ell}{t/2} \cdot (3/\sqrt{d})^t \le (4\sqrt{d})^{t/2} (3/\sqrt{d})^t \le (3/\sqrt{d})^{t/4} \le d^{-\Omega(t)}.$$

The last two inequalities follow since $d$ is large enough. □
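
The phrase "since $d$ is large enough" can be checked numerically for the middle inequality $(4\sqrt{d})^{t/2} (3/\sqrt{d})^t \le (3/\sqrt{d})^{t/4}$ of the chain above. The following sketch works in log space to avoid overflow. The function name and the sample values of $d$ and $t$ are assumptions chosen for illustration, not values taken from the proof.

```python
import math


def claim1_chain_holds(d: float, t: float) -> bool:
    """Check (4*sqrt(d))^(t/2) * (3/sqrt(d))^t <= (3/sqrt(d))^(t/4) using logarithms."""
    half_log_d = 0.5 * math.log(d)                         # log sqrt(d)
    log_union = (t / 2) * (math.log(4) + half_log_d)       # log (4 sqrt(d))^(t/2)
    log_stay = t * (math.log(3) - half_log_d)              # log (3/sqrt(d))^t
    log_target = (t / 4) * (math.log(3) - half_log_d)      # log (3/sqrt(d))^(t/4)
    return log_union + log_stay <= log_target


print(claim1_chain_holds(2 ** 20, 30))   # a large degree
print(claim1_chain_holds(2 ** 10, 30))   # a degree that is too small
```

Since both sides scale linearly in $t$ on the log scale, whether the inequality holds depends only on $d$.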
We now bound $\Pr[\mathcal{E}_2]$. Call a vertex bad if more than $\sqrt{d}$ incident edges are bad. Vertices and
edges which are not bad are called good. The set of bad vertices, denoted by $B$, has size at most
$\alpha n \le 2n/\sqrt{d}$. Now consider the modification to the random walk which terminates when it visits at
least $15t/16$ good vertices and at least $t$ vertices in all. We define two bad events for the modified
random walk experiment. We say event $\mathcal{E}_{21}$ occurs if the length of the modified walk is more than
$t$, and that event $\mathcal{E}_{22}$ occurs if the modified walk traverses fewer than $7t/8$ good edges.

Claim 2. $\Pr[\mathcal{E}_2] \le \Pr[\mathcal{E}_{21}] + \Pr[\mathcal{E}_{22}]$.

Proof. Observe that any walk of length exactly $t$ which occurs with non-zero probability in the
modified random walk, also occurs with the same probability in the original random walk. If a
walk has at least $7t/8$ good edges, then the set of these walks forms a subset of walks in the original
experiment in which $\mathcal{E}_2$ does not occur. So, $\Pr[\neg\mathcal{E}_2] \ge \Pr[\neg\mathcal{E}_{21} \wedge \neg\mathcal{E}_{22}] \ge 1 - (\Pr[\mathcal{E}_{21}] + \Pr[\mathcal{E}_{22}])$. □

Claim 3. (a) $\Pr[\mathcal{E}_{21}] \le d^{-\Omega(t)}$. (b) $\Pr[\mathcal{E}_{22}] \le d^{-\Omega(t)}$.

Proof. Part (a) follows from Lemma 3 where $B$ is the set of bad vertices having size at most $2n/\sqrt{d}$.
Thus the probability a random walk of length $t$ contains more than $t/16$ bad vertices is at most
$2^t \cdot (4/\sqrt{d})^{t/16} \le d^{-\Omega(t)}$, since $d$ is large enough.

For part (b), define random variables $X_1, \ldots, X_\ell$, where $\ell = 15t/16$, as follows. Each $X_i$ takes
a value when the random walk visits the $i$th good vertex $v$ on its path. Let $f$ be the fraction of
good edges incident on $v$. Since $v$ is good, we know $f \ge (1 - 1/\sqrt{d})$. Now, from $v$ if the random
walk traverses a bad edge, set $X_i = 0$. If the random walk traverses a good edge, toss a coin which
is heads with probability $(1 - 1/\sqrt{d})/f \le 1$, and set $X_i = 1$ if the coin falls heads, else set $X_i = 0$.

Firstly, note that the probability $\Pr[X_i = 1] = f \cdot (1 - 1/\sqrt{d})/f = (1 - 1/\sqrt{d})$. Secondly, note that
the number of good edges traversed is at least $\sum_{i=1}^{\ell} X_i$. Finally, and most crucially, note that the
$X_i$'s are independent since the coin tosses are independent at each $i$. Since $d$ is large enough, we get

$$\Pr[\mathcal{E}_{22}] \le \Pr\left[\sum_{i=1}^{\ell} X_i < 7t/8\right] \le 2^{15t/16} \left(\frac{1}{\sqrt{d}}\right)^{t/16} \le d^{-\Omega(t)}.$$

□ 

To complete the proof of Lemma 6, note that the probability a $t$-step random walk contains at most
$t/8$ bad edges and consists of at least $t/2$ distinct vertices is $\Pr[\neg\mathcal{E}_1 \wedge \neg\mathcal{E}_2] \ge 1 - (\Pr[\mathcal{E}_1] + \Pr[\mathcal{E}_2]) \ge
1 - d^{-\Omega(t)}$, from Claims 1, 2 and 3. □

2.2 Traveling Salesman Problem 

We now show an $\Omega(\log n)$ lower bound for the traveling salesman problem. In contrast to our result
for the Steiner tree problem, the TSP result is slightly weaker in that it precludes the existence
of $o(\log n)$-approximate private algorithms for arbitrarily small constant privacy parameters only.

We remark here that a lower bound for universal TSP implies a similar lower bound for any 
universal Steiner tree algorithm which returns a distribution on spanning trees. However, this is 
not the case when the algorithm returns a collection of paths; in particular, our next theorem below 
does not imply Theorem 1 even in a weak sense, that is, even if we restrict the parameter $\varepsilon$ to be
less than the constant $\varepsilon_0$ (see Appendix A for details).

Theorem 2. There exists a metric space $(V, c)$ and a constant $\varepsilon_0$ such that for any distribution $\mathcal{D}$ on tours $\sigma$ of $V$, there exists a set $X$ of size $O(\log n)$ such that

$$\Pr_{\sigma \sim \mathcal{D}}[c(\sigma_X) = o(\log n) \cdot \mathrm{opt}_{TSP}(X)] \le \tfrac{1}{2}\exp(-\varepsilon_0 |X|).$$

At a high level, the idea as before is to choose as our underlying graph a Ramanujan graph $G$, and consider the shortest path metric induced by this graph. We show that for any fixed permutation $\sigma$ of vertices, with high probability a pair of random walks, say $q_1, q_2$, has the property that they frequently alternate with respect to $\sigma$. Moreover, with high probability, every vertex on $q_1$ is at distance $\Omega(\log n)$ from every vertex in $q_2$. The alternation, along with the large pairwise distance between vertices of $q_1$ and $q_2$, implies that on the input set defined by the vertices of $q_1$ and $q_2$, the cost of the tour induced by $\sigma$ is $\Omega(\log n)$ times the optimal cost.

As stated in the Introduction, Gorodezky et al. [10] also consider the shortest path metric on
Ramanujan expanders to prove their lower bound on universal TSP. However, instead of taking
clients from two independent random walks, they use a single random walk to obtain their set of
'bad' vertices. Seemingly, our use of two random walks makes the proof easier, and allows us to
make a stronger statement: the RHS in the probability claim in Theorem 2 is exponentially small in
$|X|$, while [10] implies only a constant. This is not sufficient for part (c) of the following corollary.

As in the case of Steiner tree problem, we get the following corollaries of the above theorem. 

Corollary 2. (a) There is no $o(\log n)$-approximate universal TSP algorithm. (b) There is no $o(k)$-approximate universal TSP algorithm, where $k$ is the size of the terminal set. (c) There exists $\varepsilon_0 > 0$ such that there is no $o(\log n)$-approximate private algorithm with privacy parameter at most $\varepsilon_0$.

Proof of Theorem 2. In the proof below we do not optimize for the constant $\varepsilon_0$. Using Lemma [H we pick an $(n, d, \beta)$ expander of diameter $O(\log n)$, where $d$ is a constant such that $\beta < 1/10$. Let $(V, c)$ be the corresponding metric space obtained via the shortest path metric and choose a vertex $r$ as the root vertex. As in the proof of Lemma HI it suffices to construct a distribution $\mathcal{D}'$ on subsets $X$ of size at most $C_0 \log n/\varepsilon_0$, for some constant $C_0$, such that given any permutation $\sigma$ on the vertices of $G$,

$$\Pr_{X \sim \mathcal{D}'}[c(\sigma_X) \le o(\log n)\,\mathrm{opt}_{TSP}(X)] \le \tfrac{1}{2}\exp(-C_0 \log n). \qquad (3)$$

We construct $\mathcal{D}'$ as follows. Pick a vertex uniformly at random and perform a random walk $q_1$ for $t := \frac{1}{4}\log_d n$ steps. Let $X_1$ be the set of vertices visited in this walk. Repeat this process independently to generate a second walk $q_2$ and let $X_2$ be the set of vertices visited in the second random walk. The set of vertices visited by the two walks together defines our terminal set, namely, $X = X_1 \cup X_2$. Note that $|X| \le 2t = O(\log n)$. Since the diameter of the graph is $O(\log n)$, we have $\mathrm{opt}_{TSP}(X) = O(\log n)$. This defines the distribution $\mathcal{D}'$.
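The construction of the terminal set and the cost of the tour that a fixed ordering induces on it can be sketched in code. This is an illustrative sketch only: the circulant graph is an assumed stand-in (it is not a Ramanujan graph), the root is omitted from the tour for brevity, and all function names and parameter values are hypothetical.

```python
import random
from collections import deque

def circulant_graph(n, offsets):
    """Adjacency lists of the circulant graph where i is adjacent to i +/- s (mod n)."""
    adj = []
    for i in range(n):
        nbrs = set()
        for s in offsets:
            nbrs.add((i + s) % n)
            nbrs.add((i - s) % n)
        adj.append(sorted(nbrs))
    return adj

def all_pairs_distances(adj):
    """Shortest-path metric of an unweighted graph, via BFS from every vertex."""
    n = len(adj)
    dist = []
    for src in range(n):
        row = [-1] * n
        row[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if row[v] < 0:
                    row[v] = row[u] + 1
                    queue.append(v)
        dist.append(row)
    return dist

def random_walk(adj, steps, rng):
    """Vertices visited by a random walk from a uniformly random start."""
    v = rng.randrange(len(adj))
    visited = [v]
    for _ in range(steps):
        v = rng.choice(adj[v])
        visited.append(v)
    return visited

def induced_tour_cost(sigma, terminals, dist):
    """Cost of visiting the terminals in the order given by sigma and returning to the start."""
    order = [v for v in sigma if v in terminals]
    return sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))

# Sample a terminal set X = X1 U X2 from two independent walks (assumed sizes).
rng = random.Random(0)
adj = circulant_graph(64, [1, 5, 11])
dist = all_pairs_distances(adj)
x1 = set(random_walk(adj, 4, rng))
x2 = set(random_walk(adj, 4, rng))
cost = induced_tour_cost(list(range(64)), x1 | x2, dist)
```

On a 12-cycle with terminals {0, 3, 6, 9}, the identity ordering pays 12 while an ordering that alternates between opposite sides pays 18, which is the kind of gap the alternation argument amplifies.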

Let $\mathcal{E}_1$ be the event that the starting point of $q_2$ is at distance at least $3t$ from the starting point of $q_1$. Thus when the event $\mathcal{E}_1$ occurs, each vertex in $X_1$ is at distance at least $t$ from any vertex in $X_2$. Note that $\Pr[\mathcal{E}_1]$ is exactly the fraction of vertices in $G$ which are at distance at least $3t$ from any given vertex. Since at most $d^{3t}$ $(= n^{3/4})$ vertices are at distance less than $3t$ from any vertex, $\Pr[\mathcal{E}_1]$ is at least $(1 - n^{-1/4}) = 1 - \exp(-\Omega(\log n))$.

We partition $\sigma$ into $\ell = \gamma \log_d n$ blocks of length $n/\ell$ each, where $\gamma$ is a constant to be specified later in the proof of Claim 4. Let $\mathcal{E}_2$ denote the event that both $q_1$ and $q_2$ visit at least $3\ell/4$ blocks each. The claim below shows that this event occurs with high probability.

Claim 4. $\Pr_{X \sim \mathcal{D}'}[\mathcal{E}_2] \ge 1 - \exp(-\Omega(\log_d n))$.

Proof. By symmetry, it suffices to analyze the probability of the event that $q_1$ visits fewer than $3\ell/4$ blocks. Fix any set of $3\ell/4$ blocks, and let $B$ denote the union of these $3\ell/4$ blocks. By Lemma 2, the probability that $q_1$ remains inside $B$ is bounded by $(\beta + \tfrac{3}{4})^t \le (\tfrac{1}{10} + \tfrac{3}{4})^t = 2^{-C_1 \log_d n}$ for some constant $C_1 > 0$. Set $\gamma$ to be $C_1/2$. The probability that $q_1$ visits fewer than $3\ell/4$ blocks can thus be bounded by $\binom{\ell}{3\ell/4} \cdot 2^{-C_1 \log_d n} \le 2^{\ell} \cdot 2^{-C_1 \log_d n} \le 2^{-(C_1 \log_d n)/2} = \exp(-\Omega(\log_d n))$. □

By a union bound, we get that there exists a suitable constant $C_1'$ such that $\Pr[\mathcal{E}_1 \wedge \mathcal{E}_2] \ge (1 - \tfrac{1}{2}\exp(-C_1' \log_d n))$. Observe that, when $\mathcal{E}_2$ occurs, there are at least $\ell/4$ blocks which are visited by both $q_1$ and $q_2$. If $\mathcal{E}_1$ occurs as well, then for each such block, $\sigma_X$ pays a cost of at least $t$ since it visits a vertex in $X_1$ followed by a vertex in $X_2$, or vice versa, and these vertices are at least $t$ apart. So if both $\mathcal{E}_1$ and $\mathcal{E}_2$ occur, the cost of $\sigma_X$ is at least $t\ell/4 = \Omega(\log^2 n)$, since $d$ is a constant. Using the fact that $\mathrm{opt}_{TSP}(X) = O(\log n)$, we get that $\Pr_{X \sim \mathcal{D}'}[c(\sigma_X) = o(\log n)\,\mathrm{opt}_{TSP}(X)] \le \tfrac{1}{2}\exp(-C_1' \log_d n)$. We choose the constant $C_0 := C_1'/\log d$ and set $\varepsilon_0 := 2C_1'$; observe that we have $\varepsilon_0 |X| \le \varepsilon_0 \frac{\log n}{2 \log d} = C_0 \log n$. This ends the description of $\mathcal{D}'$ for which (3) holds. □
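The counting behind the final bound can be written out as a short derivation. This is a sketch under the stated events; $B_1$ and $B_2$, the sets of blocks visited by $q_1$ and $q_2$, are notation introduced here for illustration, and the last line uses that $t$ and $\ell$ are both $\Theta(\log_d n)$ with $d$ constant.

```latex
\begin{align*}
|B_1 \cap B_2| &\ge |B_1| + |B_2| - \ell
   \ge \tfrac{3\ell}{4} + \tfrac{3\ell}{4} - \ell = \tfrac{\ell}{2} \ge \tfrac{\ell}{4}, \\
c(\sigma_X) &\ge t \cdot \tfrac{\ell}{4} = \Omega(\log^2 n), \\
\frac{c(\sigma_X)}{\mathrm{opt}_{TSP}(X)} &\ge \frac{\Omega(\log^2 n)}{O(\log n)} = \Omega(\log n).
\end{align*}
```

The first line is inclusion-exclusion over the $\ell$ blocks; the bound $\ell/4$ used in the proof is therefore weaker than what the event guarantees, but it suffices.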

3 Strong Universal Lower Bounds imply Privacy Lower Bounds 

Suppose $\Pi$ is a minimization problem whose instances are indexed as tuples $(I, X)$. The first component $I$ represents the part of the input that is accessible to the algorithm (and is public); for instance, in the Steiner tree and the TSP example, this is the metric space $(V, c)$ along with the identity of the root. The second component $X$ is the part of the input which is either unknown beforehand, or corresponds to the private input. We assume that $X$ is a subset of some finite universe $U = U(I)$. In the Steiner tree and TSP example, $X$ is the set of terminals, which is a subset of all the vertices. An instance $(I, X)$ has a set of feasible solutions $\mathcal{S}(I, X)$, or simply $\mathcal{S}(X)$ when $I$ is clear from context, and we let $\mathcal{S} := \bigcup_{X \subseteq U} \mathcal{S}(X)$. In the case of Steiner trees, $\mathcal{S}(X)$ is the collection of rooted trees containing $X$; in the case of TSP it is the set of tours spanning $X \cup r$. Every solution $S \in \mathcal{S}$ has an associated cost $c(S)$, and $\mathrm{opt}(X)$ denotes the solution of minimum cost in $\mathcal{S}(X)$.

We assume that the solutions to instances of $\Pi$ have the following projection property. Given any solution $S \in \mathcal{S}(X)$ and any $X' \subseteq X$, $S$ induces a unique solution in $\mathcal{S}(X')$, denoted by $\pi_{X'}(S)$. For instance, in the case of the Steiner tree problem, a rooted tree spanning the vertices of $X$ maps to the unique minimal rooted tree spanning $X'$. Similarly, in the TSP, an ordering of the vertices in $X$ maps to the induced ordering of $X'$. In this framework, we now define approximate universal and differentially private algorithms.
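The two projections mentioned above are simple to state in code. The sketch below is illustrative only: the representations (a tour as a list, a rooted tree as a child-to-parent map) and the function names are assumptions made for the example.

```python
def project_tour(ordering, subset):
    """Projection for TSP: the ordering induced on `subset` by a tour ordering."""
    keep = set(subset)
    return [v for v in ordering if v in keep]

def project_tree(parent, root, subset):
    """Projection for Steiner tree: edge set of the minimal rooted subtree
    connecting `subset` to `root`. `parent` maps each non-root vertex to its parent."""
    edges = set()
    for v in subset:
        # Walk towards the root, stopping once we merge into an already-kept path.
        while v != root and (v, parent[v]) not in edges:
            edges.add((v, parent[v]))
            v = parent[v]
    return edges

# A small rooted tree: 0 is the root, with children 1 and 2.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
subtree_edges = project_tree(parent, 0, {3, 5})
induced = project_tour([0, 3, 1, 4, 2, 5], {4, 3})
```

Both maps compose as expected: projecting onto $X$ and then onto $X' \subseteq X$ gives the same result as projecting onto $X'$ directly.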

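An $\alpha$-approximate universal algorithm for $\Pi$ takes input $I$ and returns a distribution $\mathcal{D}$ over solutions in $\mathcal{S}(U)$ with the property that for any $X \subseteq U$, $\mathbf{E}_{S \sim \mathcal{D}}[c(\pi_X(S))] \le \alpha \cdot \mathrm{opt}(I, X)$. An $\alpha$-approximate differentially private algorithm with privacy parameter $\varepsilon$ for $\Pi$ takes as input $(I, X)$ and returns a distribution $\mathcal{D}_X$ over solutions in $\bigcup_{Y \supseteq X} \mathcal{S}(Y)$ that satisfies the following two properties. First, for all $(I, X)$, $\mathbf{E}_{S \sim \mathcal{D}_X}[c(\pi_X(S))] \le \alpha \cdot \mathrm{opt}(I, X)$. Second, for any set of solutions $\mathcal{F}$ and for any pair of sets $X$ and $X'$ with symmetric difference exactly 1, we have

$$\exp(-\varepsilon) \cdot \Pr_{S \sim \mathcal{D}_{X'}}[S \in \mathcal{F}] \le \Pr_{S \sim \mathcal{D}_X}[S \in \mathcal{F}] \le \exp(\varepsilon) \cdot \Pr_{S \sim \mathcal{D}_{X'}}[S \in \mathcal{F}].$$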
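For distributions over a finite solution set, the closeness condition can be checked outcome by outcome, since a bound on every single solution sums to a bound on every event. The following is a hedged sketch; the dictionary representation of a distribution and the function name are assumptions for illustration.

```python
import math

def is_eps_close(p, q, eps, tol=1e-12):
    """Return True if exp(-eps) * q(F) <= p(F) <= exp(eps) * q(F) for every
    event F, where p and q map solutions to probabilities.
    For finite distributions it suffices to check each single outcome."""
    factor = math.exp(eps)
    for s in set(p) | set(q):
        ps, qs = p.get(s, 0.0), q.get(s, 0.0)
        if ps > factor * qs + tol or qs > factor * ps + tol:
            return False
    return True

# Two distributions over solutions "a" and "b" for neighbouring inputs.
dist_x = {"a": 0.5, "b": 0.5}
dist_x_prime = {"a": 0.6, "b": 0.4}
close_enough = is_eps_close(dist_x, dist_x_prime, eps=0.25)
```

Note that any solution with positive probability under one distribution and zero probability under the other violates the condition for every finite $\varepsilon$; the proof of Theorem 3 relies on exactly this observation.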

It is easy to see that any $\alpha$-approximate universal algorithm is also an $\alpha$-approximate differentially private algorithm with privacy parameter $\varepsilon = 0$; the distribution $\mathcal{D}_X := \mathcal{D}$ for every $X$ suffices. We now show a converse relation: lower bounds for universal algorithms with a certain additional property imply lower bounds for private algorithms as well. We make this precise.

Fix $\rho : [n] \to [0, 1]$ to be a non-increasing function. We say that an $(\alpha, \rho)$ lower bound holds for universal algorithms if there exists $I$ with the following property. Given any distribution $\mathcal{D}$ on $\mathcal{S}(U)$, there exists a subset $X \subseteq U$ such that

$$\Pr_{S \sim \mathcal{D}}[c(\pi_X(S)) \le \alpha \cdot \mathrm{opt}(I, X)] \le \rho(|X|). \qquad (4)$$

We say that the set $X$ achieves the $(\alpha, \rho)$ lower bound. It is not hard to see that when $\rho$ is a constant function bounded away from 1, an $(\alpha, \rho)$ lower bound is equivalent to an $\Omega(\alpha)$ lower bound on universal algorithms.

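Theorem 3. Suppose there exists an $(\alpha, \rho)$ lower bound for universal algorithms for a problem $\Pi$. Then any $\varepsilon$-private algorithm for $\Pi$ with $\varepsilon \le \varepsilon_0 := \inf_X \frac{1}{|X|} \ln\left(\frac{1}{2\rho(|X|)}\right)$ has an approximation factor of $\Omega(\alpha)$.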

Proof. Let $I$ be an instance that induces the $(\alpha, \rho)$ lower bound. Consider the output of a differentially private algorithm $A$ with privacy parameter $\varepsilon < \varepsilon_0$ on the input pair $(I, \emptyset)$. Let $\mathcal{D}$ be the distribution on the solution set $\mathcal{S}$. We first claim that all $S$ in the support of $\mathcal{D}$ lie in $\mathcal{S}(U)$. Suppose not, and suppose there is a solution $S \in \mathcal{S}(Z) \setminus \mathcal{S}(U)$, for some $Z \subset U$, which is returned with non-zero probability. By the definition of differential privacy, this solution must be returned with non-zero probability when $A$ is run with $(I, U)$, contradicting feasibility since $S \notin \mathcal{S}(U)$.

Thus, $\mathcal{D}$ can be treated as a universal solution for $\Pi$. Let $X$ be the set which achieves the $(\alpha, \rho)$ lower bound for $\mathcal{D}$, and let $\mathcal{F} := \{S \in \mathcal{S}(X) : c(S) \le \alpha \cdot \mathrm{opt}(I, X)\}$. By the definition of the lower bound, we know that $\Pr_{S \sim \mathcal{D}}[S \in \mathcal{F}] \le \rho(|X|)$. Let $\mathcal{D}'$ be the output of the algorithm $A$ when the input is $(I, X)$. By the definition of differential privacy, $\Pr_{S \sim \mathcal{D}'}[S \in \mathcal{F}] \le \exp(\varepsilon \cdot |X|) \cdot \rho(|X|) \le 1/2$, from the choice of $\varepsilon$. This shows a lower bound on the approximation factor of any differentially private algorithm for $\Pi$ with parameter $\varepsilon \le \varepsilon_0$. □
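The privacy step applies the single-element guarantee once for each element of $X$. A sketch of the chain follows; the intermediate sets $X_0, \ldots, X_k$ are introduced here purely for illustration, $\mathcal{D}_{X_j}$ denotes the output distribution on input $(I, X_j)$, and the sufficient condition in the last line is what the choice of the privacy parameter has to guarantee.

```latex
\begin{align*}
&\emptyset = X_0 \subset X_1 \subset \cdots \subset X_k = X,
   \qquad |X_j \setminus X_{j-1}| = 1, \quad k = |X|, \\
&\Pr_{S \sim \mathcal{D}_{X_j}}[S \in \mathcal{F}]
   \le e^{\varepsilon} \Pr_{S \sim \mathcal{D}_{X_{j-1}}}[S \in \mathcal{F}]
   \quad (1 \le j \le k) \\
&\Longrightarrow\;
   \Pr_{S \sim \mathcal{D}_{X}}[S \in \mathcal{F}]
   \le e^{\varepsilon |X|} \Pr_{S \sim \mathcal{D}_{\emptyset}}[S \in \mathcal{F}]
   \le e^{\varepsilon |X|} \rho(|X|), \\
&\varepsilon \le \frac{1}{|X|} \ln \frac{1}{2\rho(|X|)}
   \;\Longrightarrow\; e^{\varepsilon |X|} \rho(|X|) \le \frac{1}{2}.
\end{align*}
```

With probability at least $1/2$ the returned solution therefore costs more than $\alpha \cdot \mathrm{opt}(I, X)$, so its expected cost is at least $(\alpha/2) \cdot \mathrm{opt}(I, X)$, which is the $\Omega(\alpha)$ bound.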
A Upper Bounds on Universal Algorithms

In Section [TTTt we mention that there exist $O(\log n)$-approximate algorithms for the universal Steiner tree problem and the universal TSP. The Steiner tree result follows from the results on probabilistically embedding general metrics into tree metrics; this was remarked by Jia et al. [16] and Gupta et al. [12]. The TSP result follows from the observation that any $\alpha$-approximate universal Steiner tree algorithm implies a $2\alpha$-approximate universal TSP algorithm; this follows from the standard argument of obtaining a tour from a tree, at no more than double the cost, by performing a depth-first traversal. This was noted by Schalekamp and Shmoys [27]. We remark here that this reduction does not hold when the Steiner tree algorithm is allowed to return a collection of paths; in particular, our lower bound for universal TSP (Theorem 2) does not imply the lower bound for universal Steiner tree algorithms which return path collections (Theorem 1). For completeness, we give short proofs of the above two observations.

Given a metric space $(V, c)$ and any spanning tree $T$ of $V$, let $c_T(u, v)$, for any two vertices $u, v$, be the cost of all the edges in the unique path connecting $u$ and $v$ in $T$. Given a distribution $\mathcal{D}$ on spanning trees, define the stretch of a pair $(u, v)$ to be $\mathbf{E}_{T \sim \mathcal{D}}[c_T(u, v)]/c(u, v)$. The stretch of $\mathcal{D}$ is $\max_{(u,v) \in V \times V} \mathrm{stretch}(u, v)$. The following connects the stretch and the performance of this algorithm.

Theorem 4. Suppose there exists a distribution $\mathcal{D}$ on spanning trees that has stretch at most $\alpha$. Then the distribution gives an $\alpha$-approximation for the universal Steiner tree problem.

Proof. Fix any set of terminals $X$. Let $T^*$ be the tree which attains value $\mathrm{opt}_{ST}(X)$. Let the support of $\mathcal{D}$ be $(T_1, \ldots, T_\ell)$ with $\pi_i$ being the probability of $T_i$. For every edge $(u, v) \in T^*$, let $P_i(u, v)$ be the unique $u, v$ path in $T_i$. Note that $\bigcup_{(u,v) \in T^*} P_i(u, v)$ is a sub-tree of $T_i$ which connects $X$, and thus $c(T_i[X]) \le c\big(\bigcup_{(u,v) \in T^*} P_i(u, v)\big) \le \sum_{(u,v) \in T^*} c_{T_i}(u, v)$. Thus, the expected cost of the universal Steiner tree algorithm is

$$\sum_{i=1}^{\ell} \pi_i\, c(T_i[X]) \le \sum_{i=1}^{\ell} \pi_i \sum_{(u,v) \in T^*} c_{T_i}(u, v) = \sum_{(u,v) \in T^*} \sum_{i=1}^{\ell} \pi_i\, c_{T_i}(u, v) = \sum_{(u,v) \in T^*} \mathbf{E}_{T \sim \mathcal{D}}[c_T(u, v)] \le \alpha \cdot \mathrm{opt}_{ST}(X).$$

□ 
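The stretch of a small explicit distribution can be computed directly from the definition. The sketch below is illustrative: the 4-cycle metric, the uniform distribution over its four spanning paths, and all function names are assumptions chosen to keep the example small.

```python
from itertools import combinations

def tree_distance(tree_adj, u, v):
    """Cost of the unique u-v path in a weighted tree given as {vertex: {neighbor: weight}}."""
    stack = [(u, None, 0.0)]
    while stack:
        node, prev, d = stack.pop()
        if node == v:
            return d
        for nxt, w in tree_adj[node].items():
            if nxt != prev:
                stack.append((nxt, node, d + w))
    raise ValueError("v is not reachable from u")

def stretch(trees, probs, metric):
    """Maximum over pairs (u, v) of E_T[c_T(u, v)] / c(u, v)."""
    worst = 0.0
    for u, v in combinations(list(trees[0]), 2):
        expected = sum(p * tree_distance(t, u, v) for t, p in zip(trees, probs))
        worst = max(worst, expected / metric[u][v])
    return worst

def cycle_minus_edge(n, k):
    """Spanning path of the unit-weight n-cycle obtained by deleting edge (k, k+1 mod n)."""
    adj = {i: {} for i in range(n)}
    for i in range(n):
        if i != k:
            j = (i + 1) % n
            adj[i][j] = 1.0
            adj[j][i] = 1.0
    return adj

n = 4
metric = [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]
trees = [cycle_minus_edge(n, k) for k in range(n)]
value = stretch(trees, [1.0 / n] * n, metric)  # adjacent pairs give (3 + 1 + 1 + 1) / 4
```

Here an adjacent pair is separated by distance 3 in exactly one of the four trees and by distance 1 in the others, so the stretch is 1.5; by Theorem 4 this distribution would then be a 1.5-approximation on this metric.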
It is known by the results of Fakcharoenphol et al. [8] that for any $n$-vertex metric $(V, c)$ one can
find a distribution $\mathcal{D}$ with stretch $O(\log n)$.* This gives us the following corollary.

Corollary 3. There is an $O(\log n)$-approximate universal Steiner tree algorithm.

Theorem 5. An $\alpha$-approximate universal Steiner tree algorithm implies a $2\alpha$-approximate universal
TSP algorithm.

Proof. Suppose the $\alpha$-approximate universal Steiner tree algorithm returns a distribution $\mathcal{D}$ on spanning trees. For each tree $T$ in the support of $\mathcal{D}$, consider the ordering $\sigma$ of the vertices obtained by performing a depth-first traversal of the tree. This induces a distribution on orderings, and thus a universal TSP algorithm. We claim this is $2\alpha$-approximate. Fix any subset $X \subseteq V$ and let $T[X]$ be the unique minimal tree of $T$ which spans $X \cup r$. Let $\sigma'$ be the ordering of the vertices in $T[X]$ obtained on performing a depth-first traversal of $T[X]$.

Claim 5. The order in which $\sigma'$ visits vertices of $X$ is the same order in which $\sigma$ visits them.

Proof. T[X] is obtained from T by deleting a collection of sub-trees from T. Note that all the 
vertices of any sub-tree appear contiguously in any depth-first traversal order - this is because once 
the depth first traversal visits a vertex v, it traverses all vertices in the sub-tree of v before moving 
on to any other vertex not in the sub-tree of v. Therefore, deleting a sub-tree of T and performing a 
depth-first traversal only removes a contiguous piece in the ordering $\sigma$. The ordering of the remaining
vertices is left unchanged. □ 
To complete the proof, we use the fact that if $\sigma$ is the depth-first traversal order of any tree $T$, then $c(\sigma_{V(T)}) \le 2c(E(T))$; this is because any edge of $T$ is traversed at most twice, once in the forward direction and once in reverse. Thus, $c(\sigma(X)) = c(\sigma'(X)) \le 2c(T[X]) \le 2\alpha \cdot \mathrm{opt}_{ST}(X) \le 2\alpha \cdot \mathrm{opt}_{TSP}(X)$, where the last inequality uses that the tour of $X$ contains a Steiner tree of $X$. □
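The reduction and Claim 5 can be exercised on a small tree. This is a minimal sketch under stated assumptions: trees are given as ordered child lists, edges have unit cost, the tour cost is measured in the tree metric (an upper bound on the cost in the underlying metric), and the function names are hypothetical.

```python
def dfs_order(children, root):
    """Depth-first (preorder) ordering of a rooted tree; `children` maps a vertex
    to its ordered list of children."""
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(reversed(children.get(v, [])))
    return order

def minimal_subtree(children, root, terminals):
    """Child lists of T[X]: keep the root and every vertex with a terminal in its subtree."""
    keep = {root}
    def visit(v):
        needed = v in terminals
        for c in children.get(v, []):
            if visit(c):  # always recurse so that every subtree is examined
                needed = True
        if needed:
            keep.add(v)
        return needed
    visit(root)
    return {v: [c for c in children.get(v, []) if c in keep] for v in keep}

def tree_path_length(parent, u, v):
    """Number of unit-cost edges on the u-v path, using ancestor chains."""
    depth_from_u, d = {}, 0
    while True:
        depth_from_u[u] = d
        if u not in parent:
            break
        u, d = parent[u], d + 1
    d = 0
    while v not in depth_from_u:
        v, d = parent[v], d + 1
    return d + depth_from_u[v]

children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
X = {4, 5}
sigma = dfs_order(children, 0)
sub = minimal_subtree(children, 0, X)
sigma_prime = dfs_order(sub, 0)
```

Deleting the subtrees without terminals removes contiguous pieces of $\sigma$, so $\sigma$ and $\sigma'$ agree on $X$; the induced tour through the root, 4, and 5 has cost 8, which is twice the four edges of $T[X]$.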

Corollary 4. There exists an $O(\log n)$-approximation for the universal TSP.



*Strictly speaking, the algorithms of [8] do not return a distribution on spanning trees, but rather a distribution
on what are known as hierarchically well-separated trees. However, it is known that with another constant factor loss,
one can obtain an embedding onto spanning trees of $V$ as well. See Section 5 of the paper [TS], for instance.